ABSTRACT


We consider a stochastic dynamic system governed by a multidimensional diffusion process with constant drift and diffusion coefficients. The system is corrected by an additive input which is under control. There is no limit on the rate of input into the system. The objective is to minimize the expected cumulative cost associated with the position of the system and the amount of control exerted.

It is proved that the Hamilton-Jacobi-Bellman equation of the problem has a solution, which coincides with the optimal cost of the problem. The existence of an optimal policy is proved.
1.  INTRODUCTION


This problem is motivated by studies of a dissipative system under uncertainty. A typical model would be an automatic cruise control of an aircraft subject to uncertain wind conditions. The problem is to balance the costs associated with deviation of the airplane from the prescribed course against the fuel expenditures resulting from the correction of the course.

We assume that in the absence of control the fluctuations of our stochastic system are described by a multidimensional Brownian motion with constant $n$-dimensional drift vector $g$ and $n \times n$ diffusion matrix $\sigma$:

$$y_x(t) = x + gt + \sigma w(t).$$

Here $x$ is the initial position, and $w(\cdot)$ is an $n$-dimensional standard Brownian motion on $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$.

The "quality" of the position of the system is measured by a function $h$. We assume that $h$ is a strictly convex nonnegative function such that

$$h(x)/|x| \to \infty \quad \text{as } |x| \to \infty.$$

The control is realized by $2n$ increasing, $\mathcal{F}_t$-adapted processes $\nu_i^+(t)$, $\nu_i^-(t)$, $i = 1, 2, \dots, n$. The control functional $\nu(t)$ is an $n$-dimensional $\mathcal{F}_t$-adapted process of bounded variation defined by

$$\nu(t) = (\nu_1(t), \nu_2(t), \dots, \nu_n(t)), \tag{1.1}$$

$$\nu_i(t) = \nu_i^+(t) - \nu_i^-(t). \tag{1.2}$$

The dynamics of the system under control is then

$$y_x(t) = x + gt + \sigma w(t) + \nu(t). \tag{1.3}$$


With each initial position $x$ and each control functional $\nu$ we associate a cost

$$J_x(\nu) = E\Bigl\{\int_0^\infty e^{-\alpha t} h(y_x(t))\,dt + \sum_{i=1}^n \Bigl[a_i \int_0^\infty e^{-\alpha t}\,d\nu_i^+(t) + b_i \int_0^\infty e^{-\alpha t}\,d\nu_i^-(t)\Bigr]\Bigr\}, \tag{1.4}$$

where $a_i$ and $b_i$, $i = 1, 2, \dots, n$, are positive constants and $\alpha > 0$ is a discount factor. Denote by $V$ the set of all $n$-dimensional $\mathcal{F}_t$-adapted processes $\nu$ represented in the form (1.1), (1.2). We are looking for

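$$u(x) = \inf\{J_x(\nu) : \nu \in V\} \tag{1.5}$$

and $\nu^*$ such that

$$u(x) = J_x(\nu^*). \tag{1.6}$$

Let

$$(a_{ij}) = \sigma\sigma^* \tag{1.7}$$

and $\nabla = (\partial/\partial x_1, \dots, \partial/\partial x_n)^*$. Put

$$A = -\frac{1}{2}\sum_{i,j=1}^n a_{ij}\frac{\partial^2}{\partial x_i \partial x_j} - \sum_{i=1}^n g_i \frac{\partial}{\partial x_i} + \alpha = -\frac{1}{2}\operatorname{tr}(\sigma\sigma^*\nabla^2) - g\cdot\nabla + \alpha. \tag{1.8}$$

For $q = (q_1, q_2, \dots, q_n) \in \mathbb{R}^n$ let $\|q\| = \max_i |q_i|$. Put

$$\beta(q) = \max_{\eta \in \mathbb{R}^n,\ \|\eta\| \le 1}\ \sum_{i=1}^n \bigl[-\eta_i^+(a_i + q_i) - \eta_i^-(b_i - q_i)\bigr]. \tag{1.9}$$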


Note that

$$\beta(q) \ge 0 \quad\text{and}\quad \beta(q) = 0 \ \text{ iff } \ -a_i \le q_i \le b_i,\ i = 1, 2, \dots, n. \tag{1.10}$$
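The property (1.10) can be checked numerically. The sketch below assumes that $\beta$ is the maximum over $\|\eta\| \le 1$ of $\sum_i [-\eta_i^+(a_i + q_i) - \eta_i^-(b_i - q_i)]$. Because this sum is separable and piecewise linear in each $\eta_i$, the maximum is attained at $\eta_i \in \{-1, 0, 1\}$, which gives the closed form $\sum_i [(-q_i - a_i)^+ + (q_i - b_i)^+]$. The closed form is derived here for illustration and is not stated in the text. The function names and the numerical values of $a$, $b$, $q$ are hypothetical.

```python
from itertools import product


def beta_closed_form(q, a, b):
    """Total amount by which q violates the box -a_i <= q_i <= b_i."""
    return sum(max(-qi - ai, 0.0) + max(qi - bi, 0.0)
               for qi, ai, bi in zip(q, a, b))


def beta_brute_force(q, a, b):
    """Maximise the defining sum over eta in {-1, 0, 1}^n.

    The objective is piecewise linear in each eta_i on [-1, 1] with a
    single kink at 0, so these grid points contain a maximiser.
    """
    best = float("-inf")
    for eta in product((-1.0, 0.0, 1.0), repeat=len(q)):
        total = 0.0
        for e, qi, ai, bi in zip(eta, q, a, b):
            pos, neg = max(e, 0.0), max(-e, 0.0)
            total += -pos * (ai + qi) - neg * (bi - qi)
        best = max(best, total)
    return best


def gamma(q, a, b):
    """max_i of (q_i^- / a_i) v (q_i^+ / b_i); at most 1 exactly on the box."""
    return max(max(max(-qi, 0.0) / ai, max(qi, 0.0) / bi)
               for qi, ai, bi in zip(q, a, b))
```

For a gradient inside the box both versions of $\beta$ return zero and `gamma` is at most one; outside the box $\beta$ is positive and `gamma` exceeds one, which is the link between the penalty term and the gradient constraint.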


We will show that the optimal cost function $u$ given by (1.5) satisfies the following Hamilton-Jacobi-Bellman equation

$$\max\bigl(Au - h,\ \gamma(\nabla u) - 1\bigr) = 0, \tag{1.11}$$

where $\gamma(q) = \max_{1 \le i \le n}\bigl[(q_i^-/a_i) \vee (q_i^+/b_i)\bigr]$, and we prove the existence of the optimal policy.

In Section 2 we consider a family of problems in which the allowable controls are absolutely continuous with uniformly bounded rates. We derive estimates for the cost functions of these problems. In Section 3 it is shown that a subsequence of the cost functions of the absolutely continuous control problems converges to $u(x)$ given by (1.5). Section 4 is devoted to the construction of the optimal policy.

2.  Absolutely  continuous  control  problems. 

Here and in the sequel we assume that the function $h$ is strictly convex and, roughly speaking, of polynomial growth. More precisely, there exist $p > 1$ and constants $C_0$, $C_1$ and $C_2$ such that for any $0 < \lambda < 1$, any $x \in \mathbb{R}^n$ and any $x'$ such that $|x'| \le 1$,

$$0 \le h(x) \le C_0(1 + |x|)^p, \tag{2.1}$$

$$|h(x) - h(x + x')| \le C_1\bigl(1 + h(x) + h(x + x')\bigr)^{1 - 1/p}|x'|, \tag{2.2}$$

$$0 < h(x + \lambda x') + h(x - \lambda x') - 2h(x) \le C_2\lambda^2(1 + h(x))^q, \qquad q = \Bigl(1 - \frac{2}{p}\Bigr)^+. \tag{2.3}$$


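Let $V_\varepsilon$ be the set of all $\nu \in V$ such that

$$\nu(t) = \int_0^t \dot\nu(s)\,ds, \qquad |\dot\nu(s)| \le \varepsilon^{-1} \ \text{ for all } s \ge 0, \tag{2.4}$$

and

$$u_\varepsilon(x) = \inf_{\nu \in V_\varepsilon} J_x(\nu). \tag{2.5}$$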


Formal application of the dynamic programming principle yields the following equation

$$Au_\varepsilon + \varepsilon^{-1}\beta(\nabla u_\varepsilon) = h \tag{2.6}$$

(see Fleming and Rishel (1975)). The next theorem establishes the properties of $u_\varepsilon$ which will be used in the sequel.
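A heuristic sketch of how the dynamic programming principle leads to (2.6) is given next. It assumes that $u_\varepsilon$ is twice continuously differentiable and that admissible rates are written as $\dot\nu = \eta/\varepsilon$ with $\|\eta\| \le 1$ (max norm). Both assumptions are made here for illustration and are not justified at this point of the text.

```latex
% Formal dynamic programming for rates \dot\nu = \eta/\varepsilon, \|\eta\| \le 1.
\begin{align*}
\alpha u_\varepsilon
  &= \tfrac12 \operatorname{tr}(\sigma\sigma^*\nabla^2 u_\varepsilon)
     + g\cdot\nabla u_\varepsilon + h
     + \varepsilon^{-1}\min_{\|\eta\|\le 1}\sum_{i=1}^n
       \bigl[\eta_i\,\partial_i u_\varepsilon + a_i\eta_i^+ + b_i\eta_i^-\bigr] \\
  &= \tfrac12 \operatorname{tr}(\sigma\sigma^*\nabla^2 u_\varepsilon)
     + g\cdot\nabla u_\varepsilon + h
     - \varepsilon^{-1}\max_{\|\eta\|\le 1}\sum_{i=1}^n
       \bigl[-\eta_i^+(a_i + \partial_i u_\varepsilon)
             - \eta_i^-(b_i - \partial_i u_\varepsilon)\bigr] \\
  &= \tfrac12 \operatorname{tr}(\sigma\sigma^*\nabla^2 u_\varepsilon)
     + g\cdot\nabla u_\varepsilon + h
     - \varepsilon^{-1}\beta(\nabla u_\varepsilon).
\end{align*}
% Moving the second-order and first-order terms to the left gives
% A u_\varepsilon + \varepsilon^{-1}\beta(\nabla u_\varepsilon) = h.
```

The second line uses $\eta_i = \eta_i^+ - \eta_i^-$, so that $\eta_i q_i + a_i\eta_i^+ + b_i\eta_i^- = \eta_i^+(a_i + q_i) + \eta_i^-(b_i - q_i)$.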


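(2.7) Theorem. Suppose $h(x)$ satisfies (2.1)-(2.3). Then there exist $\hat C_0$, $\hat C_1$, $\hat C_2$ independent of $\varepsilon \in (0, 1]$ such that for each $\lambda \in (0, 1)$ and each $x'$ with $|x'| \le 1$ the function $u_\varepsilon(\cdot)$ satisfies (2.8)-(2.10) below:

$$0 \le u_\varepsilon(x) \le \hat C_0(1 + |x|)^p, \tag{2.8}$$

$$|u_\varepsilon(x) - u_\varepsilon(x + x')| \le \hat C_1(1 + |x| + |x + x'|)^{p-1}|x'|, \tag{2.9}$$

$$0 \le u_\varepsilon(x + \lambda x') + u_\varepsilon(x - \lambda x') - 2u_\varepsilon(x) \le \hat C_2\lambda^2(1 + |x|)^{(p-2)^+}. \tag{2.10}$$

Moreover, (2.8)-(2.10) are also true for $u$.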


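Proof 1°. Putting $\nu = 0$, we get from (2.5)

$$\begin{aligned}
u_\varepsilon(x) \le J_x(0) &= E\Bigl\{\int_0^\infty e^{-\alpha t}h(x + \sigma w(t) + gt)\,dt\Bigr\} \\
&\le E\Bigl\{\int_0^\infty e^{-\alpha t}C_0\bigl(1 + |x + \sigma w(t) + gt|^p\bigr)\,dt\Bigr\} \\
&\le C_0\alpha^{-1} + C_0E\Bigl\{\int_0^\infty e^{-\alpha t}|x + \sigma w(t) + gt|^p\,dt\Bigr\} \\
&\le C_0\alpha^{-1} + C_0\alpha^{-1}c|x|^p,
\end{aligned}$$

where $c$ is a constant dependent on $\alpha$, $\sigma$ and $g$. Putting $\hat C_0 = C_0\alpha^{-1}\max(1, c)$, we get (2.8).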
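As an illustration of the bound $u_\varepsilon(x) \le J_x(0)$, the cost of doing nothing can be computed explicitly in a simple special case. The sketch below assumes $n = 1$ and $h(x) = x^2$ (so $p = 2$); these choices, the function names and the numerical parameters are assumptions made for the example only. Since $E(x + gt + \sigma w(t))^2 = (x + gt)^2 + \sigma^2 t$, integrating against $e^{-\alpha t}$ gives $J_x(0) = x^2/\alpha + 2xg/\alpha^2 + 2g^2/\alpha^3 + \sigma^2/\alpha^2$, which the code compares with a trapezoidal quadrature.

```python
import math


def uncontrolled_cost_closed_form(x, g, sigma, alpha):
    """J_x(0) for n = 1 and h(x) = x**2 (illustrative special case)."""
    return (x * x / alpha + 2.0 * x * g / alpha ** 2
            + 2.0 * g * g / alpha ** 3 + sigma * sigma / alpha ** 2)


def uncontrolled_cost_quadrature(x, g, sigma, alpha, horizon=60.0, steps=60000):
    """Trapezoidal rule for the discounted integral of E h(y_x(t)) on [0, horizon].

    The horizon must be large compared with 1/alpha so that the
    discounted tail beyond it is negligible.
    """
    def integrand(t):
        return math.exp(-alpha * t) * ((x + g * t) ** 2 + sigma * sigma * t)

    dt = horizon / steps
    total = 0.5 * (integrand(0.0) + integrand(horizon))
    for k in range(1, steps):
        total += integrand(k * dt)
    return total * dt
```

In this special case $J_x(0) \le K(1 + |x|)^2$ with $K = 1/\alpha + 2|g|/\alpha^2 + 2g^2/\alpha^3 + \sigma^2/\alpha^2$, which is the quadratic growth predicted by (2.8) for $p = 2$.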


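2°. Consider

$$\begin{aligned}
u_\varepsilon(x) - u_\varepsilon(x + x') &= \inf_{\nu}\sup_{\nu'}\bigl[J_x(\nu) - J_{x+x'}(\nu')\bigr] \\
&\le \sup_{\nu'}\bigl[J_x(\nu') - J_{x+x'}(\nu')\bigr] = \sup_{\nu}\bigl[J_x(\nu) - J_{x+x'}(\nu)\bigr].
\end{aligned}$$

Likewise $u_\varepsilon(x + x') - u_\varepsilon(x) \le \sup_\nu\bigl[J_{x+x'}(\nu) - J_x(\nu)\bigr]$. Therefore

$$\begin{aligned}
|u_\varepsilon(x) - u_\varepsilon(x + x')| &\le \sup_\nu |J_x(\nu) - J_{x+x'}(\nu)| \\
&\le \sup_\nu E\Bigl\{\int_0^\infty e^{-\alpha t}\bigl|h(y_x(t)) - h(y_{x+x'}(t))\bigr|\,dt\Bigr\} \\
&\le \sup_\nu C_1E\Bigl\{\int_0^\infty \bigl|1 + h(y_x(t)) + h(y_{x+x'}(t))\bigr|^{1-1/p}\,|y_x(t) - y_{x+x'}(t)|\,e^{-\alpha t}\,dt\Bigr\} \\
&\le C_1|x'|\alpha^{-1/p}\sup_\nu\Bigl(E\Bigl\{\int_0^\infty \bigl|1 + h(y_x(t)) + h(y_{x+x'}(t))\bigr|e^{-\alpha t}\,dt\Bigr\}\Bigr)^{1-1/p}.
\end{aligned} \tag{2.11}$$

The last inequality in (2.11) is due to Hölder's inequality. By virtue of (2.8), we can consider in (2.11) only those $\nu$ for which $E\int_0^\infty h(y_x(t))e^{-\alpha t}\,dt \le \hat C_0(1 + |x|)^p$. Therefore, applying Hölder's inequality to the last line of (2.11) once more,

$$|u_\varepsilon(x) - u_\varepsilon(x + x')| \le |x'|\,C_1\alpha^{-1}(2 + \hat C_0/\alpha)(1 + |x| + |x + x'|)^{p-1},$$

whence (2.9) follows.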

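3°. Since $h$ is a convex function, the function $J_x(\nu)$ is convex in $(x, \nu)$. Because $V_\varepsilon$ is a convex set, the function $u_\varepsilon(x)$ is convex as well. Therefore the first part of the inequality (2.10) follows. Put $x_1 = x + \lambda x'$, $x_2 = x - \lambda x'$. Then

$$\begin{aligned}
u_\varepsilon(x + \lambda x') + u_\varepsilon(x - \lambda x') - 2u_\varepsilon(x) &= \inf_{\nu_1}\inf_{\nu_2}\sup_\nu\bigl\{J_{x_1}(\nu_1) + J_{x_2}(\nu_2) - 2J_x(\nu)\bigr\} \\
&\le \sup_\nu\bigl\{J_{x_1}(\nu) + J_{x_2}(\nu) - 2J_x(\nu)\bigr\} \\
&\le \sup_\nu E\Bigl\{\int_0^\infty e^{-\alpha t}C_2\lambda^2\bigl(1 + h(y_x(t))\bigr)^q\,dt\Bigr\}.
\end{aligned} \tag{2.12}$$

If $q = 0$ then (2.12) implies (2.10) in an obvious manner. If $q > 0$ (that is, $p > 2$) then by Hölder's inequality

$$\begin{aligned}
E\Bigl\{\int_0^\infty e^{-\alpha t}\bigl(1 + h(y_x(t))\bigr)^q\,dt\Bigr\} &\le \alpha^{-2/p}\Bigl(E\Bigl\{\int_0^\infty e^{-\alpha t}\bigl[1 + h(y_x(t))\bigr]\,dt\Bigr\}\Bigr)^{(p-2)/p} \\
&\le \alpha^{-2/p}\bigl(\alpha^{-1} + \hat C_0(1 + |x|)^p\bigr)^{(p-2)/p}.
\end{aligned} \tag{2.13}$$

The last inequality in (2.13) is due to (2.8). Simple analysis shows that (2.13) implies (2.10). The proof of (2.8)-(2.10) for $u$ is the same.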

(2.14) Theorem. The optimal cost $u_\varepsilon(x)$ is the unique solution of (2.6) under the conditions (2.8)-(2.10).


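Proof 1°. For a nondegenerate $\sigma$ the existence of a solution follows from the classical results (see Fleming and Rishel (1975)). If $\sigma$ is degenerate then consider $\sigma_\delta = [\sigma, \delta I]$, which is an $n \times 2n$ matrix. Note that

$$\sigma_\delta\sigma_\delta^* = \sigma\sigma^* + \delta^2 I$$

is nondegenerate for all sufficiently small $\delta$. Consider a new optimization problem in which $w$ in (1.3) is a $2n$-dimensional standard Brownian motion while $\sigma$ is replaced by $\sigma_\delta$. Let $u_{\varepsilon,\delta}$ be the corresponding optimal cost given by (2.5). Then $u_{\varepsilon,\delta}$ satisfies

$$-\frac{\delta^2}{2}\Delta u_{\varepsilon,\delta} + Au_{\varepsilon,\delta} + \varepsilon^{-1}\beta(\nabla u_{\varepsilon,\delta}) = h. \tag{2.14}$$

Repeating step by step the proof of Theorem (2.7), we can see that (2.8)-(2.10) hold with $\hat C_0$, $\hat C_1$, $\hat C_2$ independent of $\delta > 0$ for all sufficiently small $\delta$. It follows that $u_{\varepsilon,\delta}(x)$, $\nabla u_{\varepsilon,\delta}(x)$ and $Au_{\varepsilon,\delta}(x)$ are locally bounded in $x$, uniformly in $\delta$. The latter implies the existence of a subsequence $\delta_k \to 0$ such that $|Au_{\varepsilon,\delta_k}(x)|$ is locally bounded in $x$ and

$$u_{\varepsilon,\delta_k}(x) \to u_\varepsilon(x), \qquad \nabla u_{\varepsilon,\delta_k}(x) \to \nabla u_\varepsilon(x)$$

locally uniformly in $x$, and $Au_{\varepsilon,\delta_k} \to Au_\varepsilon$ as a distribution (in Schwartz' sense). Passing to a limit in (2.14), we get the validity of (2.6) for $u_\varepsilon$.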




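2°. To prove uniqueness assume that there are two solutions $u_\varepsilon$ and $v_\varepsilon$ of (2.6). Let

$$A_0 = -\frac{1}{2}\sum_{i,j} a_{ij}\frac{\partial^2}{\partial x_i\partial x_j} - \sum_i g_i\frac{\partial}{\partial x_i}.$$

(Recall $A = A_0 + \alpha I$.) Then

$$A_0(u_\varepsilon - v_\varepsilon) = -\alpha(u_\varepsilon - v_\varepsilon) + \varepsilon^{-1}\bigl(\beta(\nabla v_\varepsilon) - \beta(\nabla u_\varepsilon)\bigr). \tag{2.15}$$

Let $W(x) = u_\varepsilon(x) - v_\varepsilon(x)$ and $w(x) = W(x)\psi(x, \lambda)$, where $\psi(x, \lambda) = (\lambda + |x|^2)^{-p}$ (the value of the constant $\lambda$ will be chosen later). It follows from (2.8) that $w(x) \to 0$ as $|x| \to \infty$. Suppose $w \not\equiv 0$ and $w(x) > 0$ for some $x$. (If $w(x) < 0$ then consider $v_\varepsilon - u_\varepsilon$.) Let

$$w(x_0) = \max_x w(x) > 0. \tag{2.16}$$

Calculations show that

$$A_0\psi(x, \lambda) + \bigl[\operatorname{tr}\bigl(\sigma\sigma^*\nabla\psi(x, \lambda)\nabla\psi(x, \lambda)^*\bigr)\bigr]/\psi(x, \lambda) = \delta(x, \lambda)\psi(x, \lambda), \tag{2.17}$$

where $\sup_x|\delta(x, \lambda)| \to 0$ as $\lambda \to \infty$, and

$$\beta(\nabla v_\varepsilon) - \beta(\nabla u_\varepsilon) = \gamma(x, \varepsilon)\cdot[\nabla v_\varepsilon - \nabla u_\varepsilon] = -\gamma(x, \varepsilon)\cdot\nabla W(x), \tag{2.18}$$

where $\|\gamma(x, \varepsilon)\| \le 1$. In view of (2.16), $\nabla w(x_0) = 0$. Therefore

$$\nabla W(x_0)\psi(x_0, \lambda) = -W(x_0)\nabla\psi(x_0, \lambda) \tag{2.19}$$

and

$$\nabla W(x_0) = -W(x_0)\nabla\psi(x_0, \lambda)/\psi(x_0, \lambda). \tag{2.20}$$

Since $\nabla\psi(x_0, \lambda)/\psi(x_0, \lambda) \to 0$ as $\lambda \to \infty$, we can combine (2.20) and (2.18) to get

$$\beta(\nabla v_\varepsilon(x_0)) - \beta(\nabla u_\varepsilon(x_0)) = W(x_0)\delta(\varepsilon, \lambda), \tag{2.21}$$

where $\sup_{0 < \varepsilon \le 1}|\delta(\varepsilon, \lambda)| \to 0$ as $\lambda \to \infty$. Also,

$$A_0w(x_0) = A_0W(x_0)\psi(x_0, \lambda) + W(x_0)A_0\psi(x_0, \lambda) - \operatorname{tr}\bigl(\sigma\sigma^*\nabla W(x_0)\nabla\psi(x_0, \lambda)^*\bigr). \tag{2.22}$$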


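Applying (2.15), (2.21), (2.17) and (2.19) to the right-hand side of (2.22), we get

$$\begin{aligned}
A_0w(x_0) &= \bigl[-\alpha W(x_0) + \varepsilon^{-1}W(x_0)\delta(\varepsilon, \lambda)\bigr]\psi(x_0, \lambda) + W(x_0)A_0\psi(x_0, \lambda) \\
&\quad - \operatorname{tr}\bigl(\sigma\sigma^*\bigl(-\nabla\psi(x_0, \lambda)W(x_0)/\psi(x_0, \lambda)\bigr)\nabla\psi(x_0, \lambda)^*\bigr) \\
&= W(x_0)\psi(x_0, \lambda)\bigl(-\alpha + \varepsilon^{-1}\delta(\varepsilon, \lambda) + \delta(x_0, \lambda)\bigr).
\end{aligned}$$

Now choose $\lambda > 0$ such that $-\alpha + \varepsilon^{-1}\delta(\varepsilon, \lambda) + \delta(x_0, \lambda) < 0$. Since $W(x_0) > 0$, this implies $A_0w(x_0) < 0$, which is in contradiction with (2.16), because $A_0w \ge 0$ at a point of maximum of $w$.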

3.  Solution  of  the  Hamilton-Jacobi-Bellman  equation 

Passing to a limit in (2.6) as $\varepsilon \to 0$ (and assuming for a moment convergence of $u_\varepsilon$, $\nabla u_\varepsilon$ and $Au_\varepsilon$ to $u$, $\nabla u$ and $Au$ respectively), we get the inequalities

$$Au(x) \le h, \tag{3.1}$$

$$-a_i \le (\nabla u(x))_i \le b_i. \tag{3.2}$$

Assuming also that at each point $x$ at least one of (3.1) and (3.2) is tight, we get (1.11). In this section we will show that $u$ given by (1.5) is actually a solution of (1.11).

(3.3) Theorem. There exist $c_1, c_2 > 0$ such that

$$c_1|x| \le u(x) \le c_2(1 + |x|). \tag{3.4}$$

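Proof 1°. For $x \in \mathbb{R}^n$ put

$$\eta(x) = \begin{cases} x/|x| - x, & \text{if } |x| \ge 2, \\ 0 & \text{otherwise,} \end{cases}$$

$$\hat\nu_x(t) = \eta(x)1_{\{|x| \ge 2\}} + \sum_{\tau_k \le t}\eta(z_{\tau_k}), \tag{3.5}$$

where $z_{\tau_k}$ is the position of the process at the moment $\tau_k$ just before the correction,

$$\tau_0 = 0, \qquad \tau_k = \inf\Bigl\{t > \tau_{k-1} : \Bigl|x + \eta(x)1_{\{|x| \ge 2\}} + gt + \sigma w(t) + \sum_{n=1}^{k-1}\eta(z_{\tau_n})\Bigr| = 2\Bigr\}.$$

The policy $\hat\nu_x(t)$ acts in the following manner. When the process is outside the ball of radius 2 it is instantaneously moved inside the ball of radius 1. Then there is no action until the process reaches the boundary of the ball of radius 2, at which moment it is moved again into the ball of radius 1, and so on. Let

$$U(x) = E\Bigl\{\int_0^\infty e^{-\alpha t}h\bigl(x + gt + \sigma w(t) + \hat\nu_x(t)\bigr)\,dt + \sum_{k=1}^\infty e^{-\alpha\tau_k}\sum_{i=1}^n\bigl[a_i\bigl(\hat\nu_{x,i}(\tau_k) - \hat\nu_{x,i}(\tau_k-)\bigr)^+ + b_i\bigl(\hat\nu_{x,i}(\tau_k) - \hat\nu_{x,i}(\tau_k-)\bigr)^-\bigr]\Bigr\}, \quad \text{if } |x| < 2.$$

Then

$$J_x(\hat\nu_x) = \begin{cases} U(x), & \text{if } |x| < 2, \\ U(x/|x|) + \sum_{i=1}^n\bigl[a_i(x_i/|x| - x_i)^+ + b_i(x_i/|x| - x_i)^-\bigr], & \text{if } |x| \ge 2. \end{cases} \tag{3.6}$$

It is easy to see that $U(x)$ is a continuous function and therefore bounded in $\{x : |x| \le 2\}$. Thus formula (3.6) implies that for $\hat\nu_x$ given by (3.5), $J_x(\hat\nu_x)/|x|$ is bounded for $|x| \ge 2$, whence the second inequality in (3.4) follows.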
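The policy described above (push the process back to the unit sphere whenever it reaches radius 2) is easy to simulate. The sketch below is a discrete-time Euler approximation, so the process may slightly overshoot radius 2 within a step before it is corrected. The scalar diffusion coefficient `sigma`, the step size, the seed and all function names are assumptions made for the illustration.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


def simulate_push_policy(x, g, sigma, dt=1e-3, steps=5000, seed=0):
    """Euler scheme for y = x + g t + sigma w(t) + nu(t) under the push policy.

    Whenever |y| >= 2 the control jumps by eta(y) = y/|y| - y, which
    moves the state onto the unit sphere. Returns the controlled path,
    the accumulated control nu and the number of jumps.
    """
    rng = random.Random(seed)
    y = list(x)
    nu = [0.0] * len(x)
    path, jumps = [], 0
    for k in range(steps + 1):
        if k > 0:
            y = [yi + gi * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
                 for yi, gi in zip(y, g)]
        r = norm(y)
        if r >= 2.0:
            eta = [yi / r - yi for yi in y]  # jump that lands on the unit sphere
            y = [yi + ei for yi, ei in zip(y, eta)]
            nu = [ni + ei for ni, ei in zip(nu, eta)]
            jumps += 1
        path.append(tuple(y))
    return path, nu, jumps
```

Starting from a point with $|x| \ge 2$ the first recorded state already has norm 1, and every recorded state stays strictly inside the ball of radius 2, which is why the running cost along this policy is bounded by the maximum of $h$ on that ball.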


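2°. Let $a = \min(a_i \wedge b_i,\ i = 1, 2, \dots, n)$. Let

$$A_R = \{|gt + \sigma w(t)| < R \ \text{ for all } 0 \le t \le 1\}.$$

Choose $R$ such that

$$P(A_R) \ge \tfrac{1}{2}$$

and

$$h(x)/|x| \ge a \ \text{ for all } |x| \ge R. \tag{3.7}$$

Let $|x| \ge 3R$ and let $\nu$ be any policy. Put $B = \{|y_x(t)| = |x + gt + \sigma w(t) + \nu(t)| \ge |x|/3 \ \text{ for all } t \le 1\}$. Then

$$\begin{aligned}
J_x(\nu) &\ge e^{-\alpha}E\Bigl\{\int_0^1 h(y_x(t))\,dt + a\sum_{i=1}^n\bigl[\nu_i^+(1) + \nu_i^-(1)\bigr]\Bigr\} \\
&\ge e^{-\alpha}E\Bigl\{\Bigl(\int_0^1 h(y_x(t))1_B\,dt + a\sum_{i=1}^n\bigl[\nu_i^+(1) + \nu_i^-(1)\bigr](1 - 1_B)\Bigr)1_{A_R}\Bigr\}.
\end{aligned} \tag{3.8}$$

We have $1_Bh(y_x(t)) \ge 1_Ba|x|/3$ by virtue of (3.7). On $A_R$ the quantity $|x + gt + \sigma w(t)|$ exceeds $2|x|/3$ for all $t \le 1$. Therefore on $B^c \cap A_R$

$$\sum_{i=1}^n\bigl(\nu_i^+(1) + \nu_i^-(1)\bigr) \ge |x|/3.$$

Hence (3.8) exceeds

$$e^{-\alpha}E\Bigl\{\Bigl(\int_0^1 (a|x|/3)1_B\,dt + (a|x|/3)(1 - 1_B)\Bigr)1_{A_R}\Bigr\} \ge e^{-\alpha}(a|x|/3)P(A_R) \ge e^{-\alpha}a|x|/6. \tag{3.9}$$

Inequality (3.9) implies the first inequality in (3.4).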

(3.10) Theorem. There exist a constant $c$ and a sequence $\varepsilon_k \downarrow 0$ such that for any $R > 0$ there exists $N$ such that for every $k \ge N$

$$u_{\varepsilon_k}(x) \le c(1 + |x|) \ \text{ for all } |x| \le R.$$


Proof. Consider the space

$$\mathcal{V} = \{v : v \in L^2_\psi,\ |\nabla v| \in L^2_\psi\},$$

where $\psi = \psi(x, \lambda) = (\lambda + |x|^2)^{-p-n+1}$ and $L^2_\psi$ denotes the set of functions $v$ on $\mathbb{R}^n$ for which $v^2\psi$ is integrable.

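Inequalities (2.8)-(2.10) show that $u_\varepsilon(x)$, $\nabla u_\varepsilon(x)$ and $\|\partial^2u_\varepsilon/\partial x^2\|$ are uniformly bounded in $\varepsilon$ on every compact subset of $\mathbb{R}^n$. The same inequalities also show that $u_\varepsilon$, $|\nabla u_\varepsilon|$ and $\|\partial^2u_\varepsilon/\partial x^2\|$ are uniformly bounded in $L^2_\psi$. Hence there exist a function $u_0 \in \mathcal{V}$ and a subsequence $\varepsilon_k$ such that

$$u_{\varepsilon_k} \to u_0 \ \text{ weakly in } \mathcal{V}, \tag{3.11}$$

$$Au_{\varepsilon_k} \to Au_0 \ \text{ weakly in } L^2_\psi, \tag{3.12}$$

and $\varepsilon_k^{-1}\beta(\nabla u_{\varepsilon_k}) = h - Au_{\varepsilon_k}$ is bounded in $L^2_\psi$. Therefore

$$\lim_{k \to \infty}\beta(\nabla u_{\varepsilon_k}) = 0. \tag{3.13}$$

Since the convergence of $\nabla u_{\varepsilon_k}$ is locally uniform, by virtue of (1.10), for every $R$ and every $\delta > 0$ there exists $N$ such that for every $k \ge N$

$$-a_i - \delta \le (\nabla u_{\varepsilon_k}(x))_i \le b_i + \delta \ \text{ for all } |x| \le R. \tag{3.14}$$

Since $u_{\varepsilon_k}(x)$ is monotonically decreasing in $k$, (3.14) implies the statement of the theorem.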




(3.15) Proposition. Let $V_0$ denote the set of $\nu \in V$ such that $\nu(0) = 0$. Then

$$u(x) = \inf_{\nu \in V_0} J_x(\nu).$$

Proof 1°. We may consider only those $\nu$ for which $J_x(\nu)$ is finite. First we show that in the minimization problem (1.5) we can consider only those $\nu$ for which there exists $r$ (possibly dependent on $x$) such that

$$|\nu(0)| \le r. \tag{3.16}$$

Let $\hat\nu_x$ be a policy given by (3.5), for which $J_x(\hat\nu_x) \le c_2(1 + |x|)$. For any policy $\nu$ and initial state $x$ consider the policy

$$\nu_r(t) = \begin{cases} \nu(t), & \text{if } |\nu(0)| \le r, \\ \hat\nu_x(t), & \text{if } |\nu(0)| > r. \end{cases}$$

If $r \ge |x|$ then $|\nu_r(0)| \le r$. Using the first inequality in (3.4),

$$\begin{aligned}
J_x(\nu) - J_x(\nu_r) &\ge E\{c_1|y_x(0)| - c_2(1 + |x|);\ |\nu(0)| > r\} \\
&\ge \bigl(c_1(r - |x|) - c_2(1 + |x|)\bigr)P\{|\nu(0)| > r\}.
\end{aligned} \tag{3.17}$$

If $r > (1 + |x|)c_2/c_1 + |x|$, then $J_x(\nu_r) \le J_x(\nu)$ while $|\nu_r(0)| \le r$.

Likewise, we can show that every policy is dominated by one for which, for every stopping time $\tau$,

$$|\nu(\tau) - \nu(\tau-)| \le (1 + |y_x(\tau-)|)c_2/c_1 + |y_x(\tau-)|. \tag{3.18}$$

2°. Let $\nu$ be any policy subject to (3.16) and (3.18) and let

$$\nu(t; \varepsilon) = \begin{cases} \nu(t), & \text{if } t \ge \varepsilon, \\ \nu(0)t/\varepsilon + \nu(t)(t - \varepsilon)/\varepsilon, & \text{if } t < \varepsilon. \end{cases}$$
The term under the expectation in $I_1$ is majorized by $\int_0^\infty e^{-\alpha t}h(y_x(t))\,dt$, which has finite mean by assumption. Thus $I_1 \to 0$ as $\varepsilon \to 0$ by virtue of the dominated convergence theorem. Since $y_x(t; \varepsilon) = y_x(t)$ on $\{t < \tau\}$, we have

$$|y_x(t; \varepsilon)| = |y_x(t)| \le |x| + r + 1 \ \text{ on } \{t < \varepsilon < \tau\}.$$

Therefore $I_2$ does not exceed $E\{M\varepsilon;\ \tau > \varepsilon\} \le M\varepsilon$, where $M = \max_{|y| \le |x| + r + 1}h(y)$.

By assumption $J_x(\nu) < \infty$; therefore, by virtue of (3.19) and (3.16) and the dominated convergence theorem, $I_3 \to 0$. By virtue of (3.18)

$$|y_x(\tau; \varepsilon)| \le |x| + r + 1 + (1 + |x| + r + 1)c_2/c_1. \tag{3.21}$$

In view of (3.6) and the strong Markov property for $y_x(\cdot\,; \varepsilon)$, we get that $I_4$ does not exceed $E\{M\varepsilon + c_2(1 + |y_x(\tau; \varepsilon)|);\ \tau < \varepsilon\}$. Therefore (3.19) and (3.21) imply $I_4 \to 0$ as $\varepsilon \to 0$. On the set $\{\tau > \varepsilon\}$ the functional $\nu_i^+(t)$, $t \le \varepsilon$, is bounded by $|x| + r + 1$. Likewise for $\nu_i^+(t; \varepsilon)$. Straightforward verification shows that both $\int_0^\varepsilon e^{-\alpha t}\,d\nu_i^+(t)$ and $\int_0^\varepsilon e^{-\alpha t}\,d\nu_i^+(t; \varepsilon)$ converge to $\nu_i^+(0)$ as $\varepsilon \to 0$. Therefore by the bounded convergence theorem $I_5 \to 0$ as $\varepsilon \to 0$. Similarly for $I_6$.

Let $V' = \bigcup_{\varepsilon > 0}V_\varepsilon$.

(3.22) Theorem. The set $V'$ is dense in $V_0$, that is

$$\inf_{\nu \in V_0}J_x(\nu) = \inf_{\nu \in V'}J_x(\nu).$$


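Proof 1°. Let $\nu \in V$ be such that $J_x(\nu) < \infty$. Put

$$\nu_i^\pm(t, \delta) = \begin{cases} \dots & \text{if } 0 \le t < \delta, \\ \dots & \text{if } t \ge \delta. \end{cases}$$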
It is obvious that $\nu(t; \delta) = (\nu_1^+(t, \delta) - \nu_1^-(t, \delta), \dots, \nu_n^+(t, \delta) - \nu_n^-(t, \delta))$ is a continuous functional with

$$|\dot\nu(t, \delta)| \le n\delta^{-1}.$$

It is also clear that

$$\nu_i^\pm(t, \delta) \le \nu_i^\pm(t) \tag{3.23}$$

and $\nu(t; \delta) \to \nu(t)$ as $\delta \to 0$ for all $t$ except possibly a countable set of points of discontinuity of $\nu$. To justify the convergence of $J_x(\nu(\cdot, \delta))$ to $J_x(\nu)$ we need, however, several extra steps as well as a further modification of $\nu(\cdot, \delta)$.


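Proof 2°. Let $y_x$ be the trajectory associated with $\nu$ and $y_x^\delta$ be the trajectory associated with $\nu(\cdot, \delta)$, and let

$$\tau_R = \inf\{t : |y_x(t)| \ge R\}, \qquad \tau_R^\delta = \inf\{t : |y_x^\delta(t)| \ge R\}.$$

Fix $R$. By virtue of Theorem (3.10) there exist $\varepsilon > 0$ and $\eta_x(\cdot) \in V_\varepsilon$ such that

$$J_x(\eta_x) \le c(1 + |x|) \ \text{ for all } |x| \le 4nR. \tag{3.24}$$

For every $\delta$ less than the $\varepsilon$ above, put

$$\sigma(R) = \tau_R \wedge \tau_{4nR}^\delta \wedge \inf\{t : \max_i\nu_i^+(t) \vee \max_i\nu_i^-(t) = N\}$$

(the constant $N$ will be chosen later) and put

$$\bar\nu(t, \delta) = \begin{cases} \nu(t, \delta), & \text{if } t \le \sigma(R), \\ \nu(\sigma(R), \delta) + \eta_{y_x^\delta(\sigma(R))}(t - \sigma(R)), & \text{if } t > \sigma(R). \end{cases}$$

The policy $\bar\nu(t, \delta)$ coincides with the policy $\nu(t, \delta)$, which approximates the original policy $\nu$, until either the process $|y_x(t)|$ reaches $R$, or the process $|y_x^\delta(t)|$ reaches the level $4nR$, or one of the control functionals $\nu_i^\pm(\cdot)$ exceeds $N$. After that $\bar\nu(\cdot, \delta)$ switches to the policy whose expected cost grows with $x$ at a rate not exceeding $|x|$. It is obvious that $\bar\nu(\cdot, \delta) \in V_\delta$. In steps 3° and 4° we will show that

$$\limsup_{\delta \to 0,\ R \to \infty}J_x(\bar\nu(\cdot, \delta)) \le J_x(\nu). \tag{3.25}$$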
R—oo 

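3°. From the definition of $\nu(\cdot, \delta)$ it is clear that

$$\nu_i^\pm(t-) \le \lim_{\delta \to 0}\nu_i^\pm(t, \delta) \le \nu_i^\pm(t).$$

Similarly, if $t_n \to t$ and $\delta_n \to 0$,

$$\nu_i^\pm(t-) \le \lim_{n \to \infty}\nu_i^\pm(t_n, \delta_n) \le \nu_i^\pm(t). \tag{3.26}$$

Suppose that with positive probability

$$\liminf_{\delta \to 0}\tau_{4nR}^\delta < \tau_R. \tag{3.27}$$

Let $w(t)$ be a trajectory for which (3.27) holds. Then there exist a sequence $\delta_n \downarrow 0$ and a bounded sequence $t_n < \tau_R$ such that

$$|y_x^{\delta_n}(t_n)| \ge 4nR. \tag{3.28}$$

By choosing a subsequence if necessary, we may assume $t_n \to t$. Since $|y_x(t_n)| < R$ on $\{t_n < \tau_R\}$, (3.28) implies

$$\lim_{n \to \infty}|y_x^{\delta_n}(t_n) - y_x(t_n)| = \lim_{n \to \infty}|\nu(t_n, \delta_n) - \nu(t_n)| \ge 3R. \tag{3.29}$$

However, $\nu_i^\pm$ is continuous from the right and has left limits. Therefore

$$\nu_i^\pm(t-) \le \lim_{t_n \to t}\nu_i^\pm(t_n) \le \nu_i^\pm(t). \tag{3.30}$$

Inequalities (3.30) and (3.26) show that

$$\lim_{n \to \infty}|\nu(t_n, \delta_n) - \nu(t_n)| \le |\nu(t) - \nu(t-)|; \tag{3.31}$$

however, $|\nu(t) - \nu(t-)| = |y_x(t) - y_x(t-)| \le 2R$ on the set $\{t \le \tau_R\}$. The latter contradicts (3.31) and (3.29). Therefore the probability of (3.27) is null.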

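4°. For any policy $\nu$ put

$$j_x(t, \nu) = \int_0^t e^{-\alpha s}h(y_x(s))\,ds + \sum_{i=1}^n\Bigl[a_i\int_0^t e^{-\alpha s}\,d\nu_i^+(s) + b_i\int_0^t e^{-\alpha s}\,d\nu_i^-(s)\Bigr].$$

Since $\tau_R \uparrow \infty$ as $R \to \infty$, we can apply the dominated convergence theorem to obtain

$$J_x(\nu) = \lim_{R \to \infty}E\{j_x(\tau_R, \nu)\},$$

which implies

$$\lim_{R \to \infty}E\{j_x(\infty, \nu) - j_x(\tau_R, \nu)\} = 0.$$

Applying the strong Markov property for $y_x(\cdot)$ and the first part of the inequality (3.4), we get

$$E\{e^{-\alpha\tau_R}c_1R\} \le E\{e^{-\alpha\tau_R}c_1|y_x(\tau_R)|\} \to 0 \ \text{ as } R \to \infty. \tag{3.32}$$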

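Consider

$$\begin{aligned}
J_x(\bar\nu(\cdot, \delta)) &= E\{j_x(\infty, \bar\nu(\cdot, \delta))\} \\
&= E\{j_x(\sigma(R), \bar\nu(\cdot, \delta))\} + E\{j_x(\infty, \bar\nu(\cdot, \delta)) - j_x(\sigma(R), \bar\nu(\cdot, \delta))\}.
\end{aligned} \tag{3.33}$$

In view of (3.32) there exists $R$ such that

$$c(1 + 4nR)E\{e^{-\alpha\tau_R}\} < \varepsilon. \tag{3.34}$$

Then choose $N$ such that

$$c(1 + 4nR)P\{\max_i\nu_i^+(\tau_R) \vee \max_i\nu_i^-(\tau_R) \ge N\} < \varepsilon. \tag{3.35}$$

Since (3.27) does not hold a.s.,

$$\sigma(R) \to \tau_R \wedge \inf\{t : \max_i\nu_i^+(t) \vee \max_i\nu_i^-(t) = N\} \equiv \hat\tau \ \text{ as } \delta \to 0.$$

Since $\bar\nu(t, \delta) \to \nu(t)$ for all $t < \hat\tau$ except a countable number of $t$,

$$j_x(\sigma(R), \bar\nu(\cdot, \delta)) \to j_x(\hat\tau, \nu). \tag{3.36}$$

Moreover, the convergence in (3.36) is bounded because $|y_x(t)| \le R$, $|y_x^\delta(t)| \le 4nR$ and $\nu_i^\pm(t; \delta) \le \nu_i^\pm(t) \le N$ if $t < \sigma(R)$. Therefore

$$\lim_{\delta \to 0}E\{j_x(\sigma(R), \bar\nu(\cdot, \delta))\} = E\{j_x(\hat\tau, \nu)\} \le J_x(\nu). \tag{3.37}$$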


In view of (3.24) and the strong Markov property for $y_x^\delta$, the second term in (3.33) does not exceed

$$\begin{aligned}
c(1 + 4nR)E\{e^{-\alpha\sigma(R)}\} &\le c(1 + 4nR)\bigl[E\{e^{-\alpha\tau_R};\ \tau_R = \sigma(R)\} + E\{e^{-\alpha\sigma(R)};\ \tau_R > \sigma(R)\}\bigr] \\
&\le c(1 + 4nR)E\{e^{-\alpha\tau_R}\} + c(1 + 4nR)P\{\tau_R > \tau_{4nR}^\delta\} + c(1 + 4nR)P\{\tau_R > \hat\tau\}.
\end{aligned} \tag{3.38}$$

The first term in (3.38) does not exceed $\varepsilon$ by virtue of (3.34). Since (3.27) holds with probability zero, the second term in (3.38) can be made smaller than $\varepsilon$ when $\delta$ is sufficiently small. The third term in (3.38) does not exceed $\varepsilon$ in view of (3.35). Since $\varepsilon$ is arbitrary, we get (3.25). The statement of the theorem is a trivial consequence of (3.25).

(3.39) Corollary. For every $x$

$$\lim_{\varepsilon \to 0}u_\varepsilon(x) = u(x).$$

The proof of this corollary follows from Proposition (3.15) and Theorem (3.22).

(3.40) Theorem. The optimal cost $u$ given by (1.5) satisfies (1.11).

Proof. The proof of Theorem (3.10) shows that there exists a function $u_0$ for which (3.11) and (3.12) hold.

Since $\beta(q)$ is a continuous function of $q$ and $\nabla u_{\varepsilon_k} \to \nabla u_0$, we have

$$\beta(\nabla u_0) = 0 \tag{3.41}$$

and in view of (1.10) we get (3.2). From (2.6)

$$Au_\varepsilon \le Au_\varepsilon + \varepsilon^{-1}\beta(\nabla u_\varepsilon) = h$$

and it follows (after passing to a limit as $\varepsilon_k \to 0$) that

$$Au_0 \le h.$$

Suppose $\gamma(\nabla u_0(x_0)) < 1$, that is

$$-a_i < (\nabla u_0(x_0))_i < b_i \ \text{ for all } 1 \le i \le n. \tag{3.42}$$

By virtue of the continuity of $\nabla u_0$, the inequality (3.42) is true for all $|x - x_0| < \delta$ for some $\delta > 0$. Since the convergence of $\nabla u_{\varepsilon_k}$ to $\nabla u_0$ is locally uniform (see the proof of Theorem (3.10)), the inequality (3.42) with $u_{\varepsilon_k}$ in place of $u_0$ is true for all $|x - x_0| < \delta$ and all $k$ sufficiently large.

For such $k$

$$Au_{\varepsilon_k} = h$$

and, passing to a limit as $k \to \infty$,

$$Au_0 = h, \tag{3.43}$$

that is, $\gamma(\nabla u_0) < 1$ implies (3.43). In view of Corollary (3.39), $u = \lim u_\varepsilon$, hence $u_0 = u$ and the theorem is proved.


4.  Construction  of  the  optimal  policy

Let

$$\Delta^2h(x, y) = (h(x) + h(y))/2 - h((x + y)/2). \tag{4.1}$$

By virtue of (2.3)

$$\Delta^2h(x, y) > 0, \ \text{ if } x \ne y. \tag{4.2}$$
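The midpoint convexity gap $\Delta^2h(x, y) = (h(x) + h(y))/2 - h((x + y)/2)$ is simple to evaluate. The sketch below uses the quadratic cost $h(x) = |x|^2$, an illustrative assumption under which the gap equals $|x - y|^2/4$ exactly; the function names are hypothetical.

```python
def midpoint_gap(h, x, y):
    """(h(x) + h(y)) / 2 - h((x + y) / 2) for vectors given as lists."""
    mid = [(xi + yi) / 2.0 for xi, yi in zip(x, y)]
    return (h(x) + h(y)) / 2.0 - h(mid)


def h_quadratic(x):
    """Illustrative strictly convex cost h(x) = |x|^2."""
    return sum(xi * xi for xi in x)
```

For a strictly convex $h$ the gap vanishes only when $x = y$, which is the property the uniqueness and convergence arguments of this section rely on.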

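(4.3) Theorem. The optimal policy $\nu^*$ (if it exists) is unique.

Proof. Suppose there are $\nu_1^*$ and $\nu_2^*$ for which (1.6) is true. Let $y_x^1(t)$ and $y_x^2(t)$ be the corresponding trajectories. Put $\bar\nu = (\nu_1^* + \nu_2^*)/2$ and $\bar y_x(t) = (y_x^1(t) + y_x^2(t))/2$. Then

$$\begin{aligned}
u(x) - J_x(\bar\nu) &= \bigl(J_x(\nu_1^*) + J_x(\nu_2^*)\bigr)/2 - J_x(\bar\nu) \\
&\ge E\Bigl\{\int_0^\infty e^{-\alpha t}\Delta^2h\bigl(y_x^1(t), y_x^2(t)\bigr)\,dt\Bigr\}.
\end{aligned} \tag{4.4}$$

By virtue of (4.2) the right-hand side of (4.4) is strictly positive if $\nu_1^*$ and $\nu_2^*$ are not equal a.s., which contradicts the definition (1.5) of $u$.

Let $m_T$ be the measure on $([0, T] \times \Omega,\ \mathcal{B}[0, T] \times \mathcal{F})$ equal to the product of Lebesgue measure and $P$.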

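(4.5) Theorem. If

$$J_x(\nu_k) \to u(x) \ \text{ as } k \to \infty, \tag{4.6}$$

then $\nu_k(t, \omega)$ converges in measure $m_T$.

Proof. 1°. Let $y_x^k$ be the trajectory corresponding to $\nu_k$. Then

$$E\Bigl\{\int_0^T 1_{\{|y_x^k(t)| > N\}}\,dt\Bigr\} \to 0 \ \text{ as } N \to \infty \tag{4.7}$$

uniformly in $k$. Indeed,

$$J_x(\nu_k) \ge e^{-\alpha T}E\Bigl\{\int_0^T h(y_x^k(t))\,dt\Bigr\} \ge e^{-\alpha T}\inf_{|x| > N}h(x)\,E\Bigl\{\int_0^T 1_{\{|y_x^k(t)| > N\}}\,dt\Bigr\}. \tag{4.8}$$

Because $J_x(\nu_k)$ is uniformly bounded in $k$ and $\lim_{N \to \infty}\inf_{|x| > N}h(x) = \infty$, we get (4.7).

2°. We need to show that for any $\varepsilon > 0$

$$E\Bigl\{\int_0^T 1_{\{|\nu_k(t) - \nu_m(t)| > \varepsilon\}}\,dt\Bigr\} \to 0 \ \text{ as } m, k \to \infty. \tag{4.9}$$

Suppose that the expectation in (4.9) is greater than $\delta > 0$ for all $m$ and $k$ (or for all $m$ and $k$ from a subsequence). Let $N$ be such that the expectation in (4.7) is less than $\delta/2$ for all $k$. Then

$$\begin{aligned}
\bigl(J_x(\nu_k) + J_x(\nu_m)\bigr)/2 - J_x\bigl((\nu_k + \nu_m)/2\bigr) &\ge e^{-\alpha T}E\Bigl\{\int_0^T \Delta^2h\bigl(y_x^k(t), y_x^m(t)\bigr)1_{\{|y_x^k(t)| \le N\}}1_{\{|y_x^m(t)| \le N\}}\,dt\Bigr\} \\
&\ge e^{-\alpha T}\rho(\varepsilon, N)E\Bigl\{\int_0^T 1_{\{|y_x^k(t) - y_x^m(t)| > \varepsilon\}}1_{\{|y_x^k(t)| \le N\}}1_{\{|y_x^m(t)| \le N\}}\,dt\Bigr\} \\
&\ge e^{-\alpha T}\rho(\varepsilon, N)\delta/2,
\end{aligned} \tag{4.10}$$

where $\rho(\varepsilon, N) = \inf_{|x - y| > \varepsilon;\ |x|, |y| \le N}\Delta^2h(x, y)$. (The usual continuity and compactness arguments show that $\rho(\varepsilon, N) > 0$ for any $N$.) Inequality (4.10) and (4.6) imply $J_x((\nu_k + \nu_m)/2) < u(x)$ for large $k$ and $m$, and we come to a contradiction.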

(4.11) Corollary. There exists an optimal policy $\nu^*$.

Proof. Taking a sequence $\nu_k$ for which (4.5) is true, and using a diagonal method, we can find a subsequence $\nu_{k_n}$ which converges a.e. $m_T$ for each $T > 0$. The usual argument shows the existence of $\nu$ such that

$$\nu(t, \omega) = \lim_{n \to \infty}\nu_{k_n}(t, \omega) \tag{4.12}$$

for Leb $\times$ $P$ almost all $(t, \omega)$. By Fatou's lemma

$$\lim_{n \to \infty}J_x(\nu_{k_n}) \ge J_x\bigl(\lim_{n \to \infty}\nu_{k_n}\bigr) = J_x(\nu).$$

Thus $J_x(\nu) \le u(x)$. Therefore $\nu$ given by (4.12) coincides with $\nu^*$.